Physician and Other Supplier Data

The Physician and Other Supplier Public Use File (Physician and Other Supplier PUF) provides information on services and procedures provided to Medicare beneficiaries by physicians and other healthcare professionals. The Physician and Other Supplier PUF contains information on utilization, payment (allowed amount and Medicare payment), and submitted charges organized by National Provider Identifier (NPI), Healthcare Common Procedure Coding System (HCPCS) code, and place of service. This PUF is based on information from CMS’s National Claims History Standard Analytic Files. The data in the Physician and Other Supplier PUF covers calendar year 2012 and contains 100% final-action physician/supplier Part B non-institutional line items for the Medicare fee-for-service population.

While the Physician and Other Supplier PUF has a wealth of information on payment and utilization for Medicare Part B services, the dataset has a number of limitations. Of particular importance is the fact that the data may not be representative of a physician’s entire practice as it only includes information on Medicare fee-for-service beneficiaries. In addition, the data are not intended to indicate the quality of care provided and are not risk-adjusted to account for differences in underlying severity of disease of patient populations. For additional limitations, please review the methodology document available below.

Description source:https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Medicare-Provider-Charge-Data/Physician-and-Other-Supplier2016.html

Data source:https://data.cms.gov/browse?q=Medicare%20Provider%20Utilization&sortBy=relevance

[1]:
cd
C:\Users\jerem\Documents\Stata Class\code

Importing

You can input the url of the dataset and import directly.

[ ]:
*import delimited "https://data.cms.gov/api/views/sk9b-znav/rows.csv?accessType=DOWNLOAD"

But I already have the dataset, so we will import from the local file.

[2]:
local filepath "C:\Users\jerem\Box Sync\Geriatrics\KO\input_data\Medicare_PUF"
import delimited "`filepath'\Medicare_Provider_Utilization_and_Payment_Data__Physician_and_Other_Supplier_PUF_CY2015.csv"


(26 vars, 9,497,892 obs)

local is a macro Stata command that allows you to store strings to local macro names. “A macro has a macro name and macro contents. Everywhere a punctuated macro name appears in a command— punctuation is defined below—the macro contents are substituted for the macro name.”

Source: https://www.stata.com/manuals13/pmacro.pdf

We set seed so that we can reproduce this random sample again. 3535 is just a number I picked.

[3]:
set seed 3535

This file is large. But since this is an exercise, we will take a small sample to work with.

sample is a Stata command that allows you to randomly keep a certain percentage of the data. So sample 10, samples 10% of the data.

[4]:
sample 10
(8,548,103 observations deleted)
[5]:
describe

Contains data
  obs:       949,789
 vars:            26
 size:   578,421,501
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
nationalprovi~r long    %12.0g                National Provider Identifier
lastnameorgan~o str70   %70s                  Last Name/Organization Name of the Provider
firstnameofth~r str20   %20s                  First Name of the Provider
middleinitial~r str1    %9s                   Middle Initial of the Provider
credentialsof~r str20   %20s                  Credentials of the Provider
genderofthepr~r str1    %9s                   Gender of the Provider
entitytypeoft~r str1    %9s                   Entity Type of the Provider
s~1oftheprovi~r str55   %55s                  Street Address 1 of the Provider
s~2oftheprovi~r str55   %55s                  Street Address 2 of the Provider
cityoftheprov~r str31   %31s                  City of the Provider
zipcodeofthep~r str12   %12s                  Zip Code of the Provider
statecodeofth~r str2    %9s                   State Code of the Provider
countrycodeof~r str2    %9s                   Country Code of the Provider
providertype    str43   %43s                  Provider Type
medicareparti~r str1    %9s                   Medicare Participation Indicator
placeofservice  str1    %9s                   Place of Service
hcpcscode       str5    %9s                   HCPCS Code
hcpcsdescript~n str256  %256s                 HCPCS Description
hcpcsdrugindi~r str1    %9s                   HCPCS Drug Indicator
numberofservi~s float   %9.0g                 Number of Services
numberofmedic~s long    %12.0g                Number of Medicare Beneficiaries
numberofdisti~i long    %12.0g                Number of Distinct Medicare Beneficiary/Per Day Services
average~damount float   %9.0g                 Average Medicare Allowed Amount
averagesubmit~t float   %9.0g                 Average Submitted Charge Amount
average~tamount float   %9.0g                 Average Medicare Payment Amount
averagemedica~n float   %9.0g                 Average Medicare Standardized Amount
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sorted by:
     Note: Dataset has changed since last saved.

Motivation

Question 1: How many physicians/organizations are there in each Stata in the random sample? Question 2: How much did each of the physicians/organizations in the random sample make? Question 3: Which providers provided Home Based Medical Care?

Question 1

Is the data unique by providers?

[7]:
codebook nationalprovideridentifier

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
nationalprovideridentifier                                                                                                                                                   National Provider Identifier
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

                  type:  numeric (long)

                 range:  [1.003e+09,1.993e+09]        units:  1
         unique values:  475,439                  missing .:  0/949,789

                  mean:   1.5e+09
              std. dev:   2.9e+08

           percentiles:        10%       25%       50%       75%       90%
                           1.1e+09   1.2e+09   1.5e+09   1.7e+09   1.9e+09

There are 475,439 providers/physicians in the random sample.

Question 2

[6]:
sum averagemedicarepaymentamount, detail

               Average Medicare Payment Amount
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .6027273              0
 5%     3.518688              0
10%     6.949286       .0006613       Obs             949,789
25%     18.46324       .0028188       Sum of Wgt.     949,789

50%     45.83917                      Mean           76.92642
                        Largest       Std. Dev.      195.5731
75%     85.54122       20864.97
90%     154.0762       23352.66       Variance       38248.86
95%     198.7759       28177.47       Skewness       40.83472
99%     648.9693       28266.43       Kurtosis       3300.137

We can see that there are 475,439 unique values in nationalprovideridentifier. But there are 949,789 observations. So we can say that the data is not unique by providers. Therefore, in order find out how much does each provider in the sample makes, we will need to collapse the data by providers.

[8]:
gen totalmedicarepaymentamount = numberofmedicarebeneficiaries*averagemedicarepaymentamount
[16]:
preserve
[17]:
collapse (sum)totalmedicarepaymentamount (first) firstnameoftheprovider lastnameorganizationnameofthepro, by(nationalprovideridentifier)
sum totalmedicarepaymentamount,
list nationalprovideridentifier firstnameoftheprovider lastnameorganizationnameofthepro if totalmedicarepaymentamount == r(max)



    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
totalmedic~t |    475,439    10389.15    99054.22          0   4.70e+07


        +-----------------------------------------------------+
        | national~r   firstn~r   lastnameorganizationnameo~o |
        |-----------------------------------------------------|
192981. | 1407855240              ROCKY MOUNTAIN HOLDINGS LLC |
        +-----------------------------------------------------+

Rocky Mountain Holdings LLC made the most in this random sample.

[18]:
histogram totalmedicarepaymentamount
(bin=56, start=0, width=839759.37)


[23]:
sum totalmedicarepaymentamount, detail

              (sum) totalmedicarepaymentamount
-------------------------------------------------------------
      Percentiles      Smallest
 1%        55.86              0
 5%       247.43       .0072745
10%       468.12       .0478853       Obs             475,439
25%     1226.516            .11       Sum of Wgt.     475,439

50%     3457.588                      Mean           10389.15
                        Largest       Std. Dev.      99054.22
75%     9697.688       1.46e+07
90%     22315.24       1.62e+07       Variance       9.81e+09
95%     35582.24       2.18e+07       Skewness         281.06
99%     87061.88       4.70e+07       Kurtosis       115756.7
[19]:
gen ltotalmedicarepaymentamount = log(totalmedicarepaymentamount)
(1 missing value generated)
[20]:
label variable ltotalmedicarepaymentamount "log of totalmedicarepaymentamount"
[22]:
histogram ltotalmedicarepaymentamount, normal
(bin=56, start=-4.9233861, width=.40338585)


[24]:
restore

Question 3

We have to find the HCPCS codes that correspond to Home base medical care services in order to identify them. HCPCS stands for Healthcare Common Procedure Coding System.

They are used for billing Medicare & Medicaid patients — The Healthcare Common Prodecure Coding System (HCPCS) is a collection of codes that represent procedures, supplies, products and services which may be provided to Medicare beneficiaries and to individuals enrolled in private health insurance programs. HCPCS codes primarily correspond to services, procedures, and equipment not covered by CPT® codes. This includes durable medical equipment (DME), prosthetics, ambulance rides, and certain drugs and medicines. Source 1: https://hcpcs.codes/

We found this page that lists the HCPCS codes that refer to home based medical care services. Source 2: https://www.lilesparker.com/2017/06/20/audits-medicare-em-home-services/

[26]:
gen hbmc = hcpcscode == "99341" | ///
hcpcscode == "99342" | ///
hcpcscode == "99343" | ///
hcpcscode == "99344" | ///
hcpcscode == "99345" | ///
hcpcscode == "93347" | ///
hcpcscode == "99348" | ///
hcpcscode == "99349" | ///
hcpcscode == "99350"
[27]:
tab hbmc

       hbmc |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |    948,260       99.84       99.84
          1 |      1,529        0.16      100.00
------------+-----------------------------------
      Total |    949,789      100.00

Let us see how many providers provided home based medicare care.

[28]:
codebook nationalprovideridentifier if hbmc == 1

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
nationalprovideridentifier                                                                                                                                                   National Provider Identifier
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

                  type:  numeric (long)

                 range:  [1.003e+09,1.993e+09]        units:  1
         unique values:  1,417                    missing .:  0/1,529

                  mean:   1.5e+09
              std. dev:   2.9e+08

           percentiles:        10%       25%       50%       75%       90%
                           1.1e+09   1.3e+09   1.5e+09   1.7e+09   1.9e+09

1,417 providers